In human-robot interaction, eye movements play an important role in non-verbal communication. However, controlling a robotic eye so that its movements match the performance of the human oculomotor system remains a major challenge. In this paper, we study how to control a realistic model of the human eye using a cable-driven actuation system that mimics the six extraocular muscles. The biomimetic design introduces novel challenges, most notably the need to control the tension of each muscle to prevent loss of tension during motion, which would lead to cable slack and loss of control. We built a robotic prototype and developed a nonlinear simulator and two controllers. In the first approach, we linearized the nonlinear model using local derivatives and designed linear-quadratic optimal controllers that optimize a cost function accounting for accuracy, energy expenditure, and movement duration. The second approach uses a recurrent neural network that learns the nonlinear system dynamics from sample trajectories, together with a nonlinear trajectory-optimization solver that minimizes a similar cost function. We focus on fast saccadic eye movements with fully unconstrained kinematics and on the generation of control signals for the six cables that simultaneously satisfy several dynamic optimization criteria. The model faithfully mimics the three-dimensional rotational kinematics and dynamics observed in human saccades. Our experimental results demonstrate that while both approaches yield similar results, the nonlinear approach is more flexible for future improvements to the model, for which the computation of position-dependent linearizations and local derivatives becomes particularly tedious.
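For illustration, the first approach can be sketched as follows: the nonlinear plant is linearized around an operating point via local finite-difference derivatives, and a linear-quadratic gain is computed for the resulting model. The plant function, dimensions, and cost weights below are illustrative placeholders, not the authors' actual eye model.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

def linearize(f, x0, u0, eps=1e-5):
    """Local derivatives A = df/dx, B = df/du of x_{t+1} = f(x, u) at (x0, u0)."""
    n, m = x0.size, u0.size
    A, B = np.zeros((n, n)), np.zeros((n, m))
    fx = f(x0, u0)
    for i in range(n):
        dx = np.zeros(n); dx[i] = eps
        A[:, i] = (f(x0 + dx, u0) - fx) / eps
    for j in range(m):
        du = np.zeros(m); du[j] = eps
        B[:, j] = (f(x0, u0 + du) - fx) / eps
    return A, B

def lqr_gain(A, B, Q, R):
    """Gain K minimizing sum_t x'Qx + u'Ru for the linearized model; u = -K x."""
    P = solve_discrete_are(A, B, Q, R)
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
```

Because the linearization is position-dependent, such gains would have to be recomputed across the workspace, which is precisely the tedium the abstract attributes to this approach.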
Matrix factorization exploits the idea that, in complex high-dimensional data, the actual signal typically lies in lower-dimensional structures. These lower-dimensional objects provide useful insight, with interpretability favored by sparse structures. Sparsity, in addition, is beneficial in terms of regularization and, thus, helps avoid over-fitting. By exploiting Bayesian shrinkage priors, we devise a computationally convenient approach for high-dimensional matrix factorization. The dependence between row and column entities is modeled by inducing flexible sparse patterns within factors. The availability of external information is accounted for in such a way that structures are allowed for but not imposed. Inspired by boosting algorithms, we pair the proposed approach with a numerical strategy relying on the sequential inclusion and estimation of low-rank contributions, with a data-driven stopping rule. Practical advantages of the proposed approach are demonstrated by means of a simulation study and the analysis of soccer heatmaps obtained from new-generation tracking data.
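As a rough illustration of the boosting-style numerical strategy, the sketch below includes and estimates rank-1 contributions sequentially until a data-driven stopping rule fires; the plain alternating-least-squares rank-1 fit stands in for the Bayesian shrinkage estimation and is purely an assumption for exposition.

```python
import numpy as np

def rank_one_fit(R, n_iter=50):
    """Alternating least squares for a rank-1 approximation u v' of R."""
    u = np.random.default_rng(0).standard_normal(R.shape[0])
    for _ in range(n_iter):
        v = R.T @ u / (u @ u)
        u = R @ v / (v @ v)
    return u, v

def sequential_factorization(Y, max_rank=10, tol=1e-3):
    """Add low-rank contributions one at a time; stop when the fit stalls."""
    residual = Y.copy()
    factors = []
    prev_norm = np.linalg.norm(residual)
    for _ in range(max_rank):
        u, v = rank_one_fit(residual)
        residual = residual - np.outer(u, v)
        norm = np.linalg.norm(residual)
        if (prev_norm - norm) / prev_norm < tol:  # illustrative stopping rule
            break
        factors.append((u, v))
        prev_norm = norm
    return factors
```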
We study the learning dynamics of self-predictive learning for reinforcement learning, a family of algorithms that learn representations by minimizing the prediction error of their own future latent representations. Despite its recent empirical success, such algorithms have an apparent defect: trivial representations (such as constants) minimize the prediction error, yet it is obviously undesirable to converge to such solutions. Our central insight is that careful design of the optimization dynamics is critical to learning meaningful representations. We identify that faster-paced optimization of the predictor and semi-gradient updates on the representation are crucial to preventing representation collapse. Then, in an idealized setup, we show that the self-predictive learning dynamics carry out spectral decomposition on the state transition matrix, effectively capturing information about the transition dynamics. Building on these theoretical insights, we propose bidirectional self-predictive learning, a novel self-predictive algorithm that learns two representations simultaneously. We examine the robustness of our theoretical insights with a number of small-scale experiments and showcase the promise of the novel representation learning algorithm with large-scale experiments.
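A minimal sketch of the two design choices highlighted above, under assumed PyTorch module names (`phi` for the encoder, `predictor` for the latent predictor): the predictor takes several inner gradient steps per encoder step (faster-paced optimization), and the prediction target is wrapped in a stop-gradient (semi-gradient update).

```python
import torch

def self_predictive_step(phi, predictor, opt_phi, opt_pred, s, s_next,
                         predictor_steps=5):
    # Faster-paced predictor optimization: several steps on a frozen encoder.
    for _ in range(predictor_steps):
        loss_p = ((predictor(phi(s).detach()) - phi(s_next).detach()) ** 2).mean()
        opt_pred.zero_grad(); loss_p.backward(); opt_pred.step()
    # Semi-gradient encoder update: no gradient flows through the target phi(s').
    loss = ((predictor(phi(s)) - phi(s_next).detach()) ** 2).mean()
    opt_phi.zero_grad(); loss.backward(); opt_phi.step()
    return loss.item()
```

Without the `detach()` on the target, gradients would also pull the target toward the prediction, and the constant (collapsed) representation becomes an attractive solution.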
One of the major challenges in Deep Reinforcement Learning for control is the need for extensive training to learn the policy. Motivated by this, we present the design of the Control-Tutored Deep Q-Networks (CT-DQN) algorithm, a Deep Reinforcement Learning algorithm that leverages a control tutor, i.e., an exogenous control law, to reduce learning time. The tutor can be designed using an approximate model of the system, without any assumption about knowledge of the system's dynamics, and is not expected to achieve the control objective if used stand-alone. During learning, the tutor occasionally suggests an action, thus partially guiding exploration. We validate our approach on three scenarios from OpenAI Gym: the inverted pendulum, lunar lander, and car racing. We demonstrate that CT-DQN achieves better or equivalent data efficiency compared to classic function-approximation solutions.
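The tutor's role in exploration might be sketched as follows, where `tutor_policy` and the mixing probability `beta` are illustrative assumptions rather than the paper's exact interface:

```python
import random

def select_action(q_values, state, tutor_policy, epsilon=0.1, beta=0.2):
    """Epsilon-greedy action selection with occasional tutor suggestions."""
    if random.random() < epsilon:
        if random.random() < beta:
            return tutor_policy(state)           # exogenous control law suggestion
        return random.randrange(len(q_values))   # uniform random exploration
    return max(range(len(q_values)), key=lambda a: q_values[a])  # greedy on Q
```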
Text classification is a natural language processing (NLP) task relevant to many commercial applications, like e-commerce and customer service. Naturally, classifying such excerpts accurately often represents a challenge, due to intrinsic language aspects, like irony and nuance. To accomplish this task, one must provide a robust numerical representation for documents, a process known as embedding. Embedding represents a key NLP field nowadays, having seen significant advances in the last decade, especially after the introduction of the word-to-vector concept and the popularization of Deep Learning models for solving NLP tasks, including Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and Transformer-based Language Models (TLMs). Despite the impressive achievements in this field, the literature coverage of embeddings for Brazilian Portuguese texts is scarce, especially when considering commercial user reviews. Therefore, this work provides a comprehensive experimental study of embedding approaches targeting binary sentiment classification of user reviews in Brazilian Portuguese, covering NLP models ranging from the classical (Bag-of-Words) to the state of the art (Transformer-based). The methods are evaluated on five open-source databases with pre-defined data partitions, made available in an open digital repository to encourage reproducibility. The fine-tuned TLMs achieved the best results in all cases, followed by the feature-based TLM, LSTM, and CNN, whose relative ranking varied depending on the database under analysis.
A major challenge in machine learning is resilience to out-of-distribution data, that is, data that lies outside the distribution of a model's training data. Training is often performed using limited, carefully curated datasets, so when a model is deployed there is often a significant distribution shift as edge cases and anomalies not included in the training data are encountered. To address this, we propose the Input Optimisation Network, an image preprocessing model that learns to optimise input data for a specific target vision model. In this work we investigate several out-of-distribution scenarios in the context of semantic segmentation for autonomous vehicles, comparing an input-optimisation-based solution to the existing approaches of finetuning the target model with augmented training data and of an adversarially trained preprocessing model. We demonstrate that our approach can enable performance on such data comparable to that of a finetuned model, and subsequently that a combined approach, whereby an input optimisation network is optimised to target a finetuned model, delivers superior performance to either method in isolation. Finally, we propose a joint optimisation approach, in which the input optimisation network and target model are trained simultaneously, which we demonstrate achieves significant further performance gains, particularly in challenging edge-case scenarios. We also demonstrate that our architecture can be reduced to a relatively compact size without a significant performance impact, potentially facilitating real-time embedded applications.
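A minimal sketch of how such a preprocessing network could be trained against a frozen target model; module names and the loss are assumptions for illustration:

```python
import torch

def train_step(io_net, target_model, criterion, optimizer, images, labels):
    """One update of the input-optimisation network; the target stays frozen."""
    target_model.eval()
    for p in target_model.parameters():
        p.requires_grad_(False)
    optimized = io_net(images)                      # learn to optimise the input
    loss = criterion(target_model(optimized), labels)
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss.item()
```

In the joint-optimisation variant described above, the target model's parameters would also receive gradients and be stepped by their own optimizer instead of being frozen.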
Shape displays are a class of haptic devices that enable whole-hand haptic exploration of 3D surfaces. However, their scalability is limited by the mechanical complexity and high cost of traditional actuator arrays. In this paper, we propose using electroadhesive auxetic skins as a strain-limiting layer to create programmable shape change in a continuous ("formable crust") shape display. Auxetic skins are manufactured as flexible printed circuit boards with dielectric-laminated electrodes on each auxetic unit cell (AUC), using monolithic fabrication to lower cost and assembly time. By layering multiple sheets and applying a voltage between electrodes on subsequent layers, electroadhesion locks individual AUCs, achieving a maximum in-plane stiffness variation of 7.6x with a power consumption of 50 µW/AUC. We first characterize an individual AUC and compare results to a kinematic model. We then validate the ability of a 5x5 AUC array to actively modify its own axial and transverse stiffness. Finally, we demonstrate this array in a continuous shape display as a strain-limiting skin to programmatically modulate the shape output of an inflatable LDPE pouch. Integrating electroadhesion with auxetics enables new capabilities for scalable, low-profile, and low-power control of flexible robotic systems.
Persistent homology, a powerful mathematical tool for data analysis, summarizes the shape of data by tracking topological features across changes in scale. Classical algorithms for persistent homology are often constrained by running times and memory requirements that grow exponentially in the number of data points. To overcome this problem, two quantum algorithms for persistent homology have been developed, based on two different approaches. However, both of these quantum algorithms consider a data set in the form of a point cloud, which can be restrictive given that many data sets come in the form of time series. In this paper, we alleviate this issue by establishing a quantum Takens delay embedding algorithm, which turns a time series into a point cloud via a pertinent embedding into a higher-dimensional space. With this quantum transformation of time series into point clouds, one may then use a quantum persistent homology algorithm to extract the topological features of the point cloud associated with the original time series.
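For reference, the classical Takens delay embedding that the quantum algorithm mirrors turns a scalar time series into a point cloud by sliding a window of `dim` samples spaced `tau` steps apart:

```python
import numpy as np

def takens_embedding(x, dim=3, tau=1):
    """Map a time series x_0..x_{N-1} to the (N - (dim-1)*tau) x dim point cloud
    whose rows are (x_i, x_{i+tau}, ..., x_{i+(dim-1)*tau})."""
    x = np.asarray(x)
    n_points = len(x) - (dim - 1) * tau
    return np.stack([x[i * tau : i * tau + n_points] for i in range(dim)], axis=1)

# e.g. takens_embedding(np.sin(np.linspace(0, 20, 200)), dim=2, tau=10)
# traces a loop whose persistent H1 feature reflects the signal's periodicity.
```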
Increasing accuracy alone, without considering uncertainty, may negatively impact Deep Neural Network (DNN) decision-making and decrease its reliability. This paper proposes five combined preprocessing and post-processing methods for time-series binary classification problems that simultaneously increase the accuracy and reliability of DNN outputs, applied to a 5G UAV security dataset. These techniques use DNN outputs as input parameters and process them in different ways. Two methods use a well-known Machine Learning (ML) algorithm as a complement, and the other three use only the confidence values that the DNN estimates. We compare seven different metrics, namely the Expected Calibration Error (ECE), Maximum Calibration Error (MCE), Mean Confidence (MC), Mean Accuracy (MA), Normalized Negative Log Likelihood (NLL), Brier Score Loss (BSL), and Reliability Score (RS), and the tradeoffs between them, to evaluate the proposed hybrid algorithms. First, we show that the eXtreme Gradient Boosting (XGB) classifier might not be reliable for binary classification under the conditions this work presents. Second, we demonstrate that at least one of the proposed methods can achieve better results than classification in the DNN softmax layer. Finally, we show that the proposed methods may improve accuracy and reliability with better uncertainty calibration, based on the assumption that the RS measures the difference between the MC and MA metrics, and that this difference should be zero to increase reliability. For example, Method 3 presents the best RS of 0.65, even compared to the XGB classifier, which achieves an RS of 7.22.
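As a sketch of two of the metrics above, the snippet below computes a binned Expected Calibration Error and a Reliability Score taken, per the paper's stated assumption, as the gap between Mean Confidence and Mean Accuracy; the exact definitions (and scaling, given the RS values quoted) may differ from the authors':

```python
import numpy as np

def ece(confidences, correct, n_bins=10):
    """Expected Calibration Error: population-weighted |confidence - accuracy|
    gap over equal-width confidence bins."""
    confidences, correct = np.asarray(confidences), np.asarray(correct)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    total = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(confidences[mask].mean() - correct[mask].mean())
            total += mask.mean() * gap
    return total

def reliability_score(confidences, correct):
    """Assumed RS = |MC - MA|; zero means confidence matches accuracy."""
    return abs(np.mean(confidences) - np.mean(correct))
```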
The ever-increasing spread of artificial neural networks does not stop at ultra-low-power edge devices. However, these devices often impose high computational demands and require specialized hardware accelerators to ensure that a design meets its power and performance constraints. Manually optimizing neural networks together with the corresponding hardware accelerators can be very challenging. This paper presents HANNAH (Hardware Accelerator and Neural Network seArcH), a framework for automated and combined hardware/software co-design of deep neural networks and hardware accelerators for resource- and power-constrained edge devices. The optimization approach uses an evolution-based search algorithm, a neural network template technique, and analytical KPI models for the configurable UltraTrail hardware accelerator template to find an optimized neural network and accelerator configuration. We demonstrate that HANNAH can find suitable neural networks with minimized power consumption and high accuracy for different audio classification tasks, such as single-class wake-word detection, multi-class keyword detection, and voice activity detection, outperforming related work.
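Schematically, the evolution-based search might look like the loop below, where `mutate` and `evaluate` (combining accuracy with the analytical KPI models) are placeholders, not HANNAH's actual implementation:

```python
import random

def evolutionary_search(init_configs, mutate, evaluate, generations=50,
                        population=20):
    """Keep the best-scoring (network, accelerator) configs each generation."""
    pop = list(init_configs)
    for _ in range(generations):
        offspring = [mutate(random.choice(pop)) for _ in range(population)]
        # evaluate() returns a cost mixing accuracy with power/latency KPIs
        pop = sorted(pop + offspring, key=evaluate)[:population]
    return pop[0]
```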